Overview
Dataset statistics
| Number of variables | 25 |
|---|---|
| Number of observations | 11401 |
| Missing cells | 46890 |
| Missing cells (%) | 16.5% |
| Total size in memory | 2.0 MiB |
| Average record size in memory | 186.0 B |
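The missing-cell percentage can be cross-checked from the counts in the table. The sketch below assumes the percentage is missing cells divided by variables times observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics table.
n_variables = 25
n_observations = 11401
missing_cells = 46890

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result agrees with the 16.5% shown in the table.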
Variable types
| Numeric | 7 |
|---|---|
| Text | 16 |
| Boolean | 2 |
Alerts
| IS_PLATFORM_USER has constant value "False" | Constant |
|---|---|
| IS_BAD_USER is highly imbalanced (99.6%) | Imbalance |
| STARTDATE has 1865 (16.4%) missing values | Missing |
| ENDDATE has 2059 (18.1%) missing values | Missing |
| DEGREE_RAW has 1941 (17.0%) missing values | Missing |
| FIELD_RAW has 2937 (25.8%) missing values | Missing |
| RSID has 1088 (9.5%) missing values | Missing |
| ULTIMATE_PARENT_RSID has 1119 (9.8%) missing values | Missing |
| SCHOOL_PRESTIGE has 3022 (26.5%) missing values | Missing |
| CAMPUS_COUNTRY has 2576 (22.6%) missing values | Missing |
| LOCATION_COUNTRY has 6084 (53.4%) missing values | Missing |
| UNIVERSITY_COUNTRY has 6498 (57.0%) missing values | Missing |
| UNIVERSITYURL has 2104 (18.5%) missing values | Missing |
| UNIVERSITYURI has 2065 (18.1%) missing values | Missing |
| UNIVERSITY_LOCATION has 6076 (53.3%) missing values | Missing |
| EDUCATION_DESCRIPTION has 7453 (65.4%) missing values | Missing |
| DEGREE_LEVEL has 3768 (33.0%) zeros | Zeros |
Reproduction
| Analysis started | 2025-09-30 07:18:54.664685 |
|---|---|
| Analysis finished | 2025-09-30 07:18:55.696919 |
| Duration | 1.03 seconds |
| Software version | ydata-profiling v4.17.0 |
| Download configuration | config.json |
Variables
USER_ID
Real number (ℝ)
| Distinct | 4521 |
|---|---|
| Distinct (%) | 39.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 577473961.9 |
| Minimum | 1241390 |
| Maximum | 2225886059 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 89.2 KiB |
Quantile statistics
| Minimum | 1241390 |
|---|---|
| 5-th percentile | 38152157 |
| Q1 | 214379847 |
| median | 437381605 |
| Q3 | 677072734 |
| 95-th percentile | 2061401536 |
| Maximum | 2225886059 |
| Range | 2224644669 |
| Interquartile range (IQR) | 462692887 |
Descriptive statistics
| Standard deviation | 554367505.1 |
|---|---|
| Coefficient of variation (CV) | 0.9599870153 |
| Kurtosis | 2.504441975 |
| Mean | 577473961.9 |
| Median Absolute Deviation (MAD) | 233244069 |
| Skewness | 1.779461698 |
| Sum | 6.58378064 × 10^12 |
| Variance | 3.073233308 × 10^17 |
| Monotonicity | Not monotonic |
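Several of the USER_ID figures are derived from others in the same tables, so they can be checked against each other. The sketch below assumes the usual definitions (range = maximum minus minimum, IQR = Q3 minus Q1, CV = standard deviation over mean, variance = standard deviation squared, sum = mean times count); the variable names are illustrative.

```python
import math

# Figures copied from the USER_ID tables.
n = 11401
mean = 577473961.9
std = 554367505.1
minimum, maximum = 1241390, 2225886059
q1, q3 = 214379847, 677072734

value_range = maximum - minimum  # reported: 2224644669
iqr = q3 - q1                    # reported: 462692887
cv = std / mean                  # reported: 0.9599870153
variance = std ** 2              # reported: 3.073233308e17
total = mean * n                 # reported: 6.58378064e12

print(value_range, iqr, round(cv, 6))
print(f"{variance:.4e} {total:.4e}")
```

The inputs are rounded as printed in the report, so the derived values match the reported ones only up to that rounding.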
| Value | Count | Frequency (%) |
| 280043285 | 39 | 0.3% |
| 1061941613 | 19 | 0.2% |
| 137321879 | 13 | 0.1% |
| 889644298 | 13 | 0.1% |
| 2080840017 | 13 | 0.1% |
| 1056527074 | 13 | 0.1% |
| 47154339 | 12 | 0.1% |
| 443026918 | 11 | 0.1% |
| 37340671 | 11 | 0.1% |
| 382026552 | 11 | 0.1% |
| Other values (4511) | 11246 | 98.6% |
| Value | Count | Frequency (%) |
| 1241390 | 1 | < 0.1% |
| 1249886 | 3 | < 0.1% |
| 1284328 | 3 | < 0.1% |
| 1490244 | 4 | < 0.1% |
| 1511359 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 2225886059 | 3 | < 0.1% |
| 2225747644 | 2 | < 0.1% |
| 2225599636 | 3 | < 0.1% |
| 2225251020 | 3 | < 0.1% |
| 2224246512 | 1 | < 0.1% |
SCHOOL
Text
| Distinct | 4284 |
|---|---|
| Distinct (%) | 37.6% |
| Missing | 1 |
| Missing (%) | < 0.1% |
| Memory size | 89.2 KiB |
Length
| Max length | 132 |
|---|---|
| Median length | 87 |
| Mean length | 27.54508772 |
| Min length | 1 |
Unique
| Unique | 2981 |
|---|---|
| Unique (%) | 26.1% |
Sample
| 1st row | East Carolina University |
|---|---|
| 2nd row | The State University of New York at Canton |
| 3rd row | The University of Tennessee Health Science Center |
| 4th row | University of Nebraska Medical Center |
| 5th row | Northwestern University |
| Value | Count | Frequency (%) |
| university | 5977 | 13.6% |
| of | 4118 | 9.4% |
| school | 2758 | 6.3% |
| college | 1652 | 3.8% |
| new | 1000 | 2.3% |
| the | 971 | 2.2% |
| high | 806 | 1.8% |
| york | 748 | 1.7% |
| | 582 | 1.3% |
| institute | 567 | 1.3% |
| Other values (4277) | 24747 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 32548 | 10.4% |
| e | 26341 | 8.4% |
| i | 25469 | 8.1% |
| o | 23091 | 7.4% |
| n | 20525 | 6.5% |
| t | 17476 | 5.6% |
| r | 16554 | 5.3% |
| a | 15346 | 4.9% |
| s | 14818 | 4.7% |
| l | 13468 | 4.3% |
| Other values (119) | 108378 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 314014 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 314014 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 314014 |
STARTDATE
Text
Missing
| Distinct | 110 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 1865 |
| Missing (%) | 16.4% |
| Memory size | 89.2 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 8 |
| Mean length | 8.000314597 |
| Min length | 8 |
Unique
| Unique | 34 |
|---|---|
| Unique (%) | 0.4% |
Sample
| 1st row | 2023/1/1 |
|---|---|
| 2nd row | 2016/1/1 |
| 3rd row | 2013/1/1 |
| 4th row | 2001/1/1 |
| 5th row | 1994/1/1 |
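STARTDATE is profiled as Text, although the samples look like slash-separated dates without zero padding. A minimal sketch of converting such strings to dates, assuming every value follows the year/month/day pattern seen in the samples (the helper name `parse_profile_date` is hypothetical):

```python
from datetime import date, datetime


def parse_profile_date(text: str) -> date:
    """Parse a 'YYYY/M/D' string such as '2023/1/1' into a date."""
    # strptime accepts month and day values without a leading zero.
    return datetime.strptime(text, "%Y/%m/%d").date()


# Sample values copied from the STARTDATE table.
samples = ["2023/1/1", "2016/1/1", "2013/1/1", "2001/1/1", "1994/1/1"]
parsed = [parse_profile_date(s) for s in samples]
print(parsed[0].isoformat())
```

Values that do not match the pattern raise `ValueError`, which makes malformed entries visible instead of silently skipping them.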
| Value | Count | Frequency (%) |
| 2016/1/1 | 497 | 5.2% |
| 2017/1/1 | 487 | 5.1% |
| 2015/1/1 | 478 | 5.0% |
| 2014/1/1 | 466 | 4.9% |
| 2018/1/1 | 437 | 4.6% |
| 2012/1/1 | 423 | 4.4% |
| 2013/1/1 | 417 | 4.4% |
| 2019/1/1 | 414 | 4.3% |
| 2011/1/1 | 392 | 4.1% |
| 2020/1/1 | 386 | 4.0% |
| Other values (100) | 5139 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 25922 | |
| / | 19072 | |
| 0 | 11136 | |
| 2 | 10410 | |
| 9 | 3434 | 4.5% |
| 8 | 1392 | 1.8% |
| 7 | 1070 | 1.4% |
| 4 | 997 | 1.3% |
| 6 | 984 | 1.3% |
| 3 | 943 | 1.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 76291 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 76291 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 76291 |
ENDDATE
Text
Missing
| Distinct | 183 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 2059 |
| Missing (%) | 18.1% |
| Memory size | 89.2 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 8 |
| Mean length | 8.376792978 |
| Min length | 8 |
Unique
| Unique | 45 |
|---|---|
| Unique (%) | 0.5% |
Sample
| 1st row | 2026/1/1 |
|---|---|
| 2nd row | 2021/1/31 |
| 3rd row | 2019/1/31 |
| 4th row | 2005/1/1 |
| 5th row | 1998/1/1 |
| Value | Count | Frequency (%) |
| 2019/1/1 | 316 | 3.4% |
| 2020/1/1 | 307 | 3.3% |
| 2021/1/1 | 301 | 3.2% |
| 2018/1/1 | 287 | 3.1% |
| 2017/1/1 | 286 | 3.1% |
| 2016/1/1 | 283 | 3.0% |
| 2015/1/1 | 265 | 2.8% |
| 2023/1/1 | 247 | 2.6% |
| 2022/1/1 | 232 | 2.5% |
| 2012/1/1 | 232 | 2.5% |
| Other values (173) | 6586 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 24785 | |
| / | 18684 | |
| 2 | 11712 | |
| 0 | 10681 | |
| 3 | 4492 | 5.7% |
| 9 | 2849 | 3.6% |
| 8 | 1205 | 1.5% |
| 4 | 1000 | 1.3% |
| 7 | 971 | 1.2% |
| 5 | 968 | 1.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 78256 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 78256 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 78256 |
DEGREE
Text
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 89.2 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 9 |
| Mean length | 6.5850364 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Master |
|---|---|
| 2nd row | Bachelor |
| 3rd row | Doctor |
| 4th row | Doctor |
| 5th row | Doctor |
| Value | Count | Frequency (%) |
| bachelor | 3900 | |
| empty | 3768 | |
| master | 1585 | |
| doctor | 944 | 7.8% |
| high | 632 | 5.3% |
| school | 632 | 5.3% |
| mba | 373 | 3.1% |
| associate | 199 | 1.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 9452 | |
| o | 7251 | |
| t | 6496 | 8.7% |
| r | 6429 | 8.6% |
| a | 5684 | 7.6% |
| c | 5675 | 7.6% |
| h | 5164 | 6.9% |
| l | 4532 | 6.0% |
| B | 4273 | 5.7% |
| p | 3768 | 5.0% |
| Other values (11) | 16352 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 75076 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 75076 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 75076 |
FIELD
Text
| Distinct | 18 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 89.2 KiB |
Length
| Max length | 22 |
|---|---|
| Median length | 5 |
| Mean length | 6.43794404 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | empty |
|---|---|
| 2nd row | Business |
| 3rd row | Nursing |
| 4th row | Biology |
| 5th row | empty |
| Value | Count | Frequency (%) |
| empty | 6992 | |
| business | 1340 | 11.7% |
| engineering | 934 | 8.2% |
| law | 280 | 2.4% |
| economics | 262 | 2.3% |
| finance | 199 | 1.7% |
| biology | 187 | 1.6% |
| medicine | 183 | 1.6% |
| education | 182 | 1.6% |
| marketing | 153 | 1.3% |
| Other values (9) | 723 | 6.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 11428 | |
| t | 8141 | |
| m | 7474 | |
| y | 7400 | |
| p | 6992 | |
| n | 5745 | |
| i | 5277 | |
| s | 4793 | |
| g | 2444 | 3.3% |
| u | 1869 | 2.5% |
| Other values (23) | 11836 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 73399 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 73399 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 73399 |
DEGREE_RAW
Text
Missing
| Distinct | 2521 |
|---|---|
| Distinct (%) | 26.6% |
| Missing | 1941 |
| Missing (%) | 17.0% |
| Memory size | 89.2 KiB |
Length
| Max length | 100 |
|---|---|
| Median length | 89 |
| Mean length | 20.92230444 |
| Min length | 1 |
Unique
| Unique | 2085 |
|---|---|
| Unique (%) | 22.0% |
Sample
| 1st row | M.S. |
|---|---|
| 2nd row | Bachelor |
| 3rd row | Doctor of Philosophy - PhD |
| 4th row | Doctor of Philosophy (PhD) |
| 5th row | Doctor of Philosophy (Ph.D.) |
| Value | Count | Frequency (%) |
| of | 4150 | 13.1% |
| bachelor | 2399 | 7.6% |
| | 2145 | 6.8% |
| science | 1487 | 4.7% |
| degree | 1343 | 4.2% |
| arts | 1250 | 3.9% |
| master | 1143 | 3.6% |
| ba | 731 | 2.3% |
| bachelor's | 729 | 2.3% |
| bs | 671 | 2.1% |
| Other values (1914) | 15626 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 22229 | 11.2% |
| e | 18368 | 9.3% |
| o | 14877 | 7.5% |
| r | 12121 | 6.1% |
| c | 10655 | 5.4% |
| a | 10374 | 5.2% |
| i | 10299 | 5.2% |
| s | 8891 | 4.5% |
| t | 8400 | 4.2% |
| n | 7757 | 3.9% |
| Other values (96) | 73954 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 197925 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 197925 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 197925 |
FIELD_RAW
Text
Missing
| Distinct | 4129 |
|---|---|
| Distinct (%) | 48.8% |
| Missing | 2937 |
| Missing (%) | 25.8% |
| Memory size | 89.2 KiB |
Length
| Max length | 168 |
|---|---|
| Median length | 88 |
| Mean length | 24.81415406 |
| Min length | 1 |
Unique
| Unique | 3446 |
|---|---|
| Unique (%) | 40.7% |
Sample
| 1st row | Instructional Technology |
|---|---|
| 2nd row | Business Administration; Management |
| 3rd row | Nursing Science |
| 4th row | Biochemistry and Molecular Biology |
| 5th row | Geological and Earth Sciences/Geosciences |
| Value | Count | Frequency (%) |
| and | 2005 | 8.1% |
| science | 772 | 3.1% |
| | 623 | 2.5% |
| management | 557 | 2.3% |
| business | 539 | 2.2% |
| engineering | 516 | 2.1% |
| studies | 474 | 1.9% |
| computer | 420 | 1.7% |
| finance | 372 | 1.5% |
| economics | 336 | 1.4% |
| Other values (2567) | 18098 |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 18862 | 9.0% |
| n | 18789 | 8.9% |
| e | 17929 | 8.5% |
| (space) | 16280 | 7.8% |
| a | 14923 | 7.1% |
| o | 12328 | 5.9% |
| t | 11794 | 5.6% |
| c | 10359 | 4.9% |
| r | 10016 | 4.8% |
| s | 9424 | 4.5% |
| Other values (103) | 69323 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 210027 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 210027 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 210027 |
RSID
Real number (ℝ)
Missing
| Distinct | 2929 |
|---|---|
| Distinct (%) | 28.4% |
| Missing | 1088 |
| Missing (%) | 9.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 119774.0406 |
| Minimum | 9 |
| Maximum | 296932 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 89.2 KiB |
Quantile statistics
| Minimum | 9 |
|---|---|
| 5-th percentile | 7837 |
| Q1 | 65045 |
| median | 115965 |
| Q3 | 155147 |
| 95-th percentile | 267350 |
| Maximum | 296932 |
| Range | 296923 |
| Interquartile range (IQR) | 90102 |
Descriptive statistics
| Standard deviation | 71806.62508 |
|---|---|
| Coefficient of variation (CV) | 0.5995174306 |
| Kurtosis | -0.1483616846 |
| Mean | 119774.0406 |
| Median Absolute Deviation (MAD) | 41405 |
| Skewness | 0.5624067994 |
| Sum | 1235229681 |
| Variance | 5156191406 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 98025 | 300 | 2.6% |
| 46455 | 231 | 2.0% |
| 47013 | 101 | 0.9% |
| 107574 | 90 | 0.8% |
| 116341 | 87 | 0.8% |
| 61156 | 87 | 0.8% |
| 66310 | 81 | 0.7% |
| 163889 | 72 | 0.6% |
| 107986 | 68 | 0.6% |
| 119646 | 67 | 0.6% |
| Other values (2919) | 9129 | 80.1% |
| (Missing) | 1088 | 9.5% |
| Value | Count | Frequency (%) |
| 9 | 27 | 0.2% |
| 37 | 1 | < 0.1% |
| 44 | 1 | < 0.1% |
| 58 | 1 | < 0.1% |
| 66 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 296932 | 1 | < 0.1% |
| 294291 | 1 | < 0.1% |
| 294288 | 1 | < 0.1% |
| 294241 | 1 | < 0.1% |
| 294210 | 1 | < 0.1% |
ULTIMATE_PARENT_RSID
Real number (ℝ)
Missing
| Distinct | 2612 |
|---|---|
| Distinct (%) | 25.4% |
| Missing | 1119 |
| Missing (%) | 9.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 120285.8274 |
| Minimum | 9 |
| Maximum | 296932 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 89.2 KiB |
Quantile statistics
| Minimum | 9 |
|---|---|
| 5-th percentile | 22485 |
| Q1 | 66310 |
| median | 115959 |
| Q3 | 154166 |
| 95-th percentile | 265493.9 |
| Maximum | 296932 |
| Range | 296923 |
| Interquartile range (IQR) | 87856 |
Descriptive statistics
| Standard deviation | 68861.37824 |
|---|---|
| Coefficient of variation (CV) | 0.5724812286 |
| Kurtosis | 0.04971963483 |
| Mean | 120285.8274 |
| Median Absolute Deviation (MAD) | 39237 |
| Skewness | 0.6488442488 |
| Sum | 1236778877 |
| Variance | 4741889413 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 98025 | 448 | 3.9% |
| 46455 | 368 | 3.2% |
| 66310 | 170 | 1.5% |
| 47013 | 144 | 1.3% |
| 107573 | 139 | 1.2% |
| 155147 | 116 | 1.0% |
| 61766 | 108 | 0.9% |
| 163889 | 105 | 0.9% |
| 116341 | 92 | 0.8% |
| 61156 | 87 | 0.8% |
| Other values (2602) | 8505 | 74.6% |
| (Missing) | 1119 | 9.8% |
| Value | Count | Frequency (%) |
| 9 | 35 | 0.3% |
| 44 | 1 | < 0.1% |
| 66 | 2 | < 0.1% |
| 67 | 1 | < 0.1% |
| 68 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 296932 | 1 | < 0.1% |
| 294291 | 1 | < 0.1% |
| 294288 | 1 | < 0.1% |
| 294210 | 1 | < 0.1% |
| 294204 | 1 | < 0.1% |
SCHOOL_PRESTIGE
Real number (ℝ)
Missing
| Distinct | 1423 |
|---|---|
| Distinct (%) | 17.0% |
| Missing | 3022 |
| Missing (%) | 26.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.3374215692 |
| Minimum | -0.978193998 |
| Maximum | 0.986606002 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 3073 |
| Negative (%) | 27.0% |
| Memory size | 89.2 KiB |
Quantile statistics
| Minimum | -0.978193998 |
|---|---|
| 5-th percentile | -0.808278024 |
| Q1 | -0.564356029 |
| median | 0.918618023 |
| Q3 | 0.960698009 |
| 95-th percentile | 0.968349993 |
| Maximum | 0.986606002 |
| Range | 1.9648 |
| Interquartile range (IQR) | 1.525054038 |
Descriptive statistics
| Standard deviation | 0.7570980225 |
|---|---|
| Coefficient of variation (CV) | 2.243774825 |
| Kurtosis | -1.615280487 |
| Mean | 0.3374215692 |
| Median Absolute Deviation (MAD) | 0.049571991 |
| Skewness | -0.5176016895 |
| Sum | 2827.255329 |
| Variance | 0.5731974157 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.960698009 | 449 | 3.9% |
| 0.963714004 | 379 | 3.3% |
| 0.967827976 | 178 | 1.6% |
| 0.956879973 | 144 | 1.3% |
| 0.940025985 | 131 | 1.1% |
| 0.964698017 | 116 | 1.0% |
| -0.453042001 | 108 | 0.9% |
| 0.967414022 | 105 | 0.9% |
| 0.924655974 | 92 | 0.8% |
| -0.638598025 | 87 | 0.8% |
| Other values (1413) | 6590 | 57.8% |
| (Missing) | 3022 | 26.5% |
| Value | Count | Frequency (%) |
| -0.978193998 | 2 | < 0.1% |
| -0.974005997 | 3 | < 0.1% |
| -0.969403982 | 1 | < 0.1% |
| -0.967266023 | 1 | < 0.1% |
| -0.965373993 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 0.986606002 | 2 | < 0.1% |
| 0.984456003 | 6 | 0.1% |
| 0.984380007 | 15 | 0.1% |
| 0.984134018 | 2 | < 0.1% |
| 0.98388797 | 6 | 0.1% |
CAMPUS_CLEANED
Text
| Distinct | 4243 |
|---|---|
| Distinct (%) | 37.2% |
| Missing | 2 |
| Missing (%) | < 0.1% |
| Memory size | 89.2 KiB |
Length
| Max length | 127 |
|---|---|
| Median length | 88 |
| Mean length | 27.30116677 |
| Min length | 2 |
Unique
| Unique | 2927 |
|---|---|
| Unique (%) | 25.7% |
Sample
| 1st row | east carolina university |
|---|---|
| 2nd row | the state university of new york at canton |
| 3rd row | the university of tennessee health science center |
| 4th row | university of nebraska medical center |
| 5th row | northwestern university |
| Value | Count | Frequency (%) |
| university | 6046 | 13.7% |
| of | 4118 | 9.3% |
| school | 2762 | 6.3% |
| college | 1678 | 3.8% |
| new | 1054 | 2.4% |
| the | 973 | 2.2% |
| high | 806 | 1.8% |
| york | 782 | 1.8% |
| institute | 569 | 1.3% |
| business | 515 | 1.2% |
| Other values (4205) | 24753 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 32657 | 10.5% |
| e | 27124 | 8.7% |
| i | 26987 | 8.7% |
| o | 23455 | 7.5% |
| n | 22516 | 7.2% |
| s | 20551 | 6.6% |
| t | 19495 | 6.3% |
| r | 17306 | 5.6% |
| a | 16922 | 5.4% |
| l | 14728 | 4.7% |
| Other values (71) | 89465 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 311206 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 311206 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 311206 |
CAMPUS_COUNTRY
Text
Missing
| Distinct | 70 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 2576 |
| Missing (%) | 22.6% |
| Memory size | 89.2 KiB |
Length
| Max length | 20 |
|---|---|
| Median length | 13 |
| Mean length | 12.19082153 |
| Min length | 4 |
Unique
| Unique | 17 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | United States |
|---|---|
| 2nd row | United States |
| 3rd row | United States |
| 4th row | Philippines |
| 5th row | Sweden |
| Value | Count | Frequency (%) |
| united | 7640 | |
| states | 7410 | |
| kingdom | 229 | 1.4% |
| india | 192 | 1.2% |
| canada | 117 | 0.7% |
| france | 69 | 0.4% |
| italy | 68 | 0.4% |
| spain | 64 | 0.4% |
| australia | 61 | 0.4% |
| china | 54 | 0.3% |
| Other values (67) | 645 | 3.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 22755 | |
| e | 15551 | |
| a | 8828 | 8.2% |
| n | 8731 | 8.1% |
| i | 8576 | 8.0% |
| d | 8296 | 7.7% |
| (space) | 7724 | 7.2% |
| s | 7644 | 7.1% |
| U | 7641 | 7.1% |
| S | 7583 | 7.0% |
| Other values (35) | 4255 | 4.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 107584 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 107584 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 107584 |
LOCATION_COUNTRY
Text
Missing
| Distinct | 62 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 6084 |
| Missing (%) | 53.4% |
| Memory size | 89.2 KiB |
Length
| Max length | 20 |
|---|---|
| Median length | 13 |
| Mean length | 12.20199361 |
| Min length | 4 |
Unique
| Unique | 16 |
|---|---|
| Unique (%) | 0.3% |
Sample
| 1st row | United States |
|---|---|
| 2nd row | United States |
| 3rd row | United States |
| 4th row | Philippines |
| 5th row | United States |
| Value | Count | Frequency (%) |
| united | 4604 | |
| states | 4423 | |
| kingdom | 180 | 1.8% |
| china | 97 | 1.0% |
| canada | 94 | 0.9% |
| india | 59 | 0.6% |
| australia | 44 | 0.4% |
| south | 36 | 0.4% |
| italy | 33 | 0.3% |
| israel | 30 | 0.3% |
| Other values (59) | 381 | 3.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 13623 | |
| e | 9239 | |
| a | 5384 | 8.3% |
| n | 5280 | 8.1% |
| i | 5251 | 8.1% |
| d | 4977 | 7.7% |
| (space) | 4664 | 7.2% |
| U | 4605 | 7.1% |
| s | 4566 | 7.0% |
| S | 4491 | 6.9% |
| Other values (35) | 2798 | 4.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 64878 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 64878 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 64878 |
UNIVERSITY_COUNTRY
Text
Missing
| Distinct | 69 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 6498 |
| Missing (%) | 57.0% |
| Memory size | 89.2 KiB |
Length
| Max length | 21 |
|---|---|
| Median length | 13 |
| Mean length | 11.83805833 |
| Min length | 4 |
Unique
| Unique | 14 |
|---|---|
| Unique (%) | 0.3% |
Sample
| 1st row | United States |
|---|---|
| 2nd row | United States |
| 3rd row | United States |
| 4th row | United States |
| 5th row | United States |
| Value | Count | Frequency (%) |
| united | 3991 | |
| states | 3893 | |
| china | 108 | 1.2% |
| kingdom | 97 | 1.1% |
| india | 97 | 1.1% |
| canada | 70 | 0.8% |
| australia | 42 | 0.5% |
| france | 40 | 0.4% |
| spain | 37 | 0.4% |
| italy | 36 | 0.4% |
| Other values (71) | 542 | 6.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 11964 | |
| e | 8230 | |
| a | 5064 | |
| n | 4769 | 8.2% |
| i | 4673 | 8.1% |
| d | 4325 | 7.5% |
| (space) | 4050 | 7.0% |
| s | 4038 | 7.0% |
| U | 3996 | 6.9% |
| S | 3963 | 6.8% |
| Other values (36) | 2970 | 5.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 58042 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 58042 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 58042 |
IS_BAD_USER
Boolean
Imbalance
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 11.3 KiB |
| Value | Count | Frequency (%) |
| False | 11398 | > 99.9% |
| True | 3 | < 0.1% |
IS_PLATFORM_USER
Boolean
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 11.3 KiB |
| Value | Count | Frequency (%) |
| False | 11401 | 100.0% |
URI
Text
| Distinct | 4521 |
|---|---|
| Distinct (%) | 39.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 89.2 KiB |
Length
| Max length | 87 |
|---|---|
| Median length | 62 |
| Mean length | 33.93167266 |
| Min length | 19 |
Unique
| Unique | 1031 |
|---|---|
| Unique (%) | 9.0% |
Sample
| 1st row | linkedin.com/in/bscobb |
|---|---|
| 2nd row | linkedin.com/in/kashifrivers |
| 3rd row | linkedin.com/in/ulanda-marcus-aiyeku-dnp-pmhnp-bc-ne-bc-840352124 |
| 4th row | linkedin.com/in/pranita-atri-76490773 |
| 5th row | linkedin.com/in/eddie-brooks-0b01289b |
| Value | Count | Frequency (%) |
| linkedin.com/in/hasantimucinozdemir | 39 | 0.3% |
| linkedin.com/in/jillchasse | 19 | 0.2% |
| linkedin.com/in/vesteragerm | 13 | 0.1% |
| linkedin.com/in/chloe-luterman | 13 | 0.1% |
| linkedin.com/in/dominic-desapio-07774a62 | 13 | 0.1% |
| linkedin.com/in/andraya-yearwood-9b3859182 | 13 | 0.1% |
| linkedin.com/in/oliverknesl | 12 | 0.1% |
| linkedin.com/in/louis-l-nock-83a01734 | 11 | 0.1% |
| linkedin.com/in/andraya-y-9b3859182 | 11 | 0.1% |
| linkedin.com/in/pietro-nardella-dellova-aa25042a | 11 | 0.1% |
| Other values (4511) | 11246 | 98.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 46027 | 11.9% |
| i | 45395 | 11.7% |
| e | 25134 | 6.5% |
| a | 22865 | 5.9% |
| / | 22802 | 5.9% |
| l | 20094 | 5.2% |
| o | 19272 | 5.0% |
| m | 17120 | 4.4% |
| c | 16264 | 4.2% |
| d | 16018 | 4.1% |
| Other values (50) | 135864 | 35.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 386855 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 386855 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 386855 | 100.0% |
EDUCATION_ID
Real number (ℝ)
| Distinct | 11380 |
|---|---|
| Distinct (%) | 99.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -9.216409477 × 10¹⁵ |
| Minimum | -9.22169 × 10¹⁸ |
|---|---|
| Maximum | 9.222084882 × 10¹⁸ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 5730 |
| Negative (%) | 50.3% |
| Memory size | 89.2 KiB |
Quantile statistics
| Minimum | -9.22169 × 10¹⁸ |
|---|---|
| 5-th percentile | -8.33183 × 10¹⁸ |
| Q1 | -4.69769 × 10¹⁸ |
| median | -3.12178 × 10¹⁶ |
| Q3 | 4.630562234 × 10¹⁸ |
| 95-th percentile | 8.342237691 × 10¹⁸ |
| Maximum | 9.222084882 × 10¹⁸ |
| Range | 1.844377488 × 10¹⁹ |
| Interquartile range (IQR) | 9.328252234 × 10¹⁸ |
Descriptive statistics
| Standard deviation | 5.350778188 × 10¹⁸ |
|---|---|
| Coefficient of variation (CV) | -580.5707962 |
| Kurtosis | -1.202570239 |
| Mean | -9.216409477 × 10¹⁵ |
| Median Absolute Deviation (MAD) | 4.663510893 × 10¹⁸ |
| Skewness | 0.007905324779 |
| Sum | -1.050762845 × 10²⁰ |
| Variance | 2.863082722 × 10³⁷ |
| Monotonicity | Not monotonic |
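Several rows in the EDUCATION_ID tables above are derived from other rows, so they can be cross-checked with simple arithmetic. The sketch below does that using the rounded figures printed in the report, so agreement is only expected to a few significant digits. The variable names are illustrative and not taken from the profiling tool.

```python
import math

# Values copied from the EDUCATION_ID tables above (already rounded by the report).
n = 11401  # number of observations in the dataset
minimum, maximum = -9.22169e18, 9.222084882e18
q1, q3 = -4.69769e18, 4.630562234e18
mean, std = -9.216409477e15, 5.350778188e18
total = -1.050762845e20  # the "Sum" row

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # CV = Standard deviation / Mean
variance = std ** 2              # Variance = Standard deviation squared
mean_from_sum = total / n        # Mean = Sum / number of observations

print(f"range    = {value_range:.9e}")
print(f"IQR      = {iqr:.9e}")
print(f"CV       = {cv:.4f}")
print(f"variance = {variance:.9e}")
print(f"mean     = {mean_from_sum:.9e}")
```

The large negative CV follows directly from a mean that is small relative to the spread. For an identifier column such as this one, it is unlikely to carry any analytical meaning.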
| Value | Count | Frequency (%) |
| -1.20821 × 10¹⁸ | 2 | < 0.1% |
| -5.18579 × 10¹⁸ | 2 | < 0.1% |
| -3.76894 × 10¹⁸ | 2 | < 0.1% |
| -7.9522 × 10¹⁸ | 2 | < 0.1% |
| -3.95649 × 10¹⁸ | 2 | < 0.1% |
| -5.9951 × 10¹⁸ | 2 | < 0.1% |
| -2.50361 × 10¹⁸ | 2 | < 0.1% |
| -8.62531 × 10¹⁸ | 2 | < 0.1% |
| -8.90762 × 10¹⁸ | 2 | < 0.1% |
| -8.34148 × 10¹⁸ | 2 | < 0.1% |
| Other values (11370) | 11381 | 99.8% |
| Value | Count | Frequency (%) |
| -9.22169 × 10¹⁸ | 1 | < 0.1% |
| -9.2214 × 10¹⁸ | 1 | < 0.1% |
| -9.22038 × 10¹⁸ | 1 | < 0.1% |
| -9.21985 × 10¹⁸ | 1 | < 0.1% |
| -9.2193 × 10¹⁸ | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 9.222084882 × 10¹⁸ | 1 | < 0.1% |
| 9.221546001 × 10¹⁸ | 1 | < 0.1% |
| 9.219007644 × 10¹⁸ | 1 | < 0.1% |
| 9.217667956 × 10¹⁸ | 1 | < 0.1% |
| 9.217205312 × 10¹⁸ | 1 | < 0.1% |
DEGREE_LEVEL
Real number (ℝ)
Zeros
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.333040961 |
| Minimum | 0 |
|---|---|
| Maximum | 6 |
| Zeros | 3768 |
| Zeros (%) | 33.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 89.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 6 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 1.945314737 |
|---|---|
| Coefficient of variation (CV) | 0.8338107944 |
| Kurtosis | -1.062920488 |
| Mean | 2.333040961 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.1521594592 |
| Sum | 26599 |
| Variance | 3.784249427 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3 | 3900 | 34.2% |
| 0 | 3768 | 33.0% |
| 4 | 1585 | 13.9% |
| 6 | 944 | 8.3% |
| 1 | 632 | 5.5% |
| 5 | 373 | 3.3% |
| 2 | 199 | 1.7% |
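Because DEGREE_LEVEL has only seven distinct values, the full value-count table above is enough to recompute the summary statistics. The sketch below does so in plain Python. Using the sample variance (dividing by n - 1) is an assumption here, chosen because it reproduces the reported figure.

```python
import math

# Value counts copied from the DEGREE_LEVEL table above.
value_counts = {3: 3900, 0: 3768, 4: 1585, 6: 944, 1: 632, 5: 373, 2: 199}

n = sum(value_counts.values())                        # number of observations
total = sum(v * c for v, c in value_counts.items())   # the "Sum" row
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in value_counts.items()) / (n - 1)
std = math.sqrt(variance)

# Share of each value in percent, rounded to one decimal as in the report.
shares = {v: round(100 * c / n, 1) for v, c in value_counts.items()}

print(n, total, round(mean, 6), round(variance, 6), round(std, 6))
print(shares)
```

The share computed for the value 0 matches the 33.0% zeros flagged in the alert for this variable.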
| Value | Count | Frequency (%) |
| 0 | 3768 | 33.0% |
| 1 | 632 | 5.5% |
| 2 | 199 | 1.7% |
| 3 | 3900 | 34.2% |
| 4 | 1585 | 13.9% |
| Value | Count | Frequency (%) |
| 6 | 944 | 8.3% |
| 5 | 373 | 3.3% |
| 4 | 1585 | 13.9% |
| 3 | 3900 | 34.2% |
| 2 | 199 | 1.7% |
SEQUENCENO
Real number (ℝ)
| Distinct | 39 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.241557758 |
| Minimum | 1 |
|---|---|
| Maximum | 39 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 89.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 5 |
| Maximum | 39 |
| Range | 38 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.927564365 |
|---|---|
| Coefficient of variation (CV) | 0.8599217922 |
| Kurtosis | 93.63555805 |
| Mean | 2.241557758 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 7.01398027 |
| Sum | 25556 |
| Variance | 3.71550438 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 4509 | 39.5% |
| 2 | 3458 | 30.3% |
| 3 | 1863 | 16.3% |
| 4 | 816 | 7.2% |
| 5 | 354 | 3.1% |
| 6 | 160 | 1.4% |
| 7 | 87 | 0.8% |
| 8 | 47 | 0.4% |
| 9 | 30 | 0.3% |
| 10 | 18 | 0.2% |
| Other values (29) | 59 | 0.5% |
| Value | Count | Frequency (%) |
| 1 | 4509 | 39.5% |
| 2 | 3458 | 30.3% |
| 3 | 1863 | 16.3% |
| 4 | 816 | 7.2% |
| 5 | 354 | 3.1% |
| Value | Count | Frequency (%) |
| 39 | 1 | < 0.1% |
| 38 | 1 | < 0.1% |
| 37 | 1 | < 0.1% |
| 36 | 1 | < 0.1% |
| 35 | 1 | < 0.1% |
UNIVERSITYURL
Text
Missing
| Distinct | 3371 |
|---|---|
| Distinct (%) | 36.3% |
| Missing | 2104 |
| Missing (%) | 18.5% |
| Memory size | 89.2 KiB |
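The two percentages in the table above appear to use different denominators: Missing (%) is relative to all rows, while Distinct (%) matches the reported value only when computed against non-missing rows. The following sketch checks this with the UNIVERSITYURL figures. The row count of 11401 comes from the dataset overview, and the variable names are illustrative.

```python
n_rows = 11401   # number of observations (dataset overview)
missing = 2104   # UNIVERSITYURL missing cells
distinct = 3371  # UNIVERSITYURL distinct values

non_missing = n_rows - missing

# Missing (%) is computed against all rows.
missing_pct = 100 * missing / n_rows

# Distinct (%) reproduces the report only with the non-missing denominator.
distinct_pct = 100 * distinct / non_missing
distinct_pct_all_rows = 100 * distinct / n_rows  # shown for contrast

print(f"non-missing rows: {non_missing}")
print(f"Missing (%):  {missing_pct:.1f}")
print(f"Distinct (%): {distinct_pct:.1f} (vs {distinct_pct_all_rows:.1f} over all rows)")
```

The same relationship holds for the other columns with missing values in this report, so Distinct (%) figures are not directly comparable between a sparse column and a complete one.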
Length
| Max length | 338 |
|---|---|
| Median length | 266 |
| Mean length | 48.26632247 |
| Min length | 36 |
Unique
| Unique | 2068 |
|---|---|
| Unique (%) | 22.2% |
Sample
| 1st row | https://www.linkedin.com/company/7522/ |
|---|---|
| 2nd row | https://www.linkedin.com/company/230771/ |
| 3rd row | https://www.linkedin.com/school/northwestern-university/ |
| 4th row | https://www.linkedin.com/company/5077/ |
| 5th row | https://www.linkedin.com/school/university-of-gothenburg/ |
| Value | Count | Frequency (%) |
| https://www.linkedin.com/school/new-york-university | 159 | 1.7% |
| https://www.linkedin.com/school/columbia-university | 150 | 1.6% |
| https://www.linkedin.com/company/3159 | 117 | 1.3% |
| https://www.linkedin.com/company/2624 | 82 | 0.9% |
| https://www.linkedin.com/school/cornell-university | 63 | 0.7% |
| https://www.linkedin.com/school/fashion-institute-of-technology | 51 | 0.5% |
| https://www.linkedin.com/company/4262 | 51 | 0.5% |
| https://www.linkedin.com/company/7201 | 48 | 0.5% |
| https://www.linkedin.com/school/yale-university | 44 | 0.5% |
| https://www.linkedin.com/company/7338 | 36 | 0.4% |
| Other values (3361) | 8496 | 91.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| / | 46484 | 10.4% |
| o | 31827 | 7.1% |
| n | 31156 | 6.9% |
| w | 28886 | 6.4% |
| i | 28502 | 6.4% |
| t | 25415 | 5.7% |
| c | 22802 | 5.1% |
| s | 21948 | 4.9% |
| l | 19906 | 4.4% |
| e | 18998 | 4.2% |
| Other values (40) | 172808 | 38.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 448732 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 448732 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 448732 | 100.0% |
UNIVERSITYURI
Text
Missing
| Distinct | 3403 |
|---|---|
| Distinct (%) | 36.5% |
| Missing | 2065 |
| Missing (%) | 18.1% |
| Memory size | 89.2 KiB |
Length
| Max length | 325 |
|---|---|
| Median length | 253 |
| Mean length | 35.30676949 |
| Min length | 23 |
Unique
| Unique | 2094 |
|---|---|
| Unique (%) | 22.4% |
Sample
| 1st row | linkedin.com/company/7522 |
|---|---|
| 2nd row | linkedin.com/company/230771 |
| 3rd row | linkedin.com/school/northwestern-university |
| 4th row | linkedin.com/company/5077 |
| 5th row | linkedin.com/school/university-of-gothenburg |
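The UNIVERSITYURI sample rows above look like the UNIVERSITYURL sample rows with the scheme, the leading "www." and the trailing slash removed. The sketch below implements that rule as inferred from the samples only. The report does not state how the column was derived, and `url_to_uri` is a hypothetical helper.

```python
def url_to_uri(url):
    """Strip the scheme, a leading 'www.', and trailing slashes (assumed rule)."""
    if url is None:  # keep missing values missing
        return None
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
            break
    if url.startswith("www."):
        url = url[len("www."):]
    return url.rstrip("/")


# URL/URI pairs taken from the sample rows of the two variables.
pairs = [
    ("https://www.linkedin.com/company/7522/", "linkedin.com/company/7522"),
    ("https://www.linkedin.com/school/northwestern-university/",
     "linkedin.com/school/northwestern-university"),
]
for url, expected in pairs:
    print(url_to_uri(url) == expected, url_to_uri(url))
```

The two columns do not line up one-to-one in this report (2104 versus 2065 missing values, for example), so this rule should be read as an approximation of the relationship rather than the actual derivation.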
| Value | Count | Frequency (%) |
| linkedin.com/school/new-york-university | 161 | 1.7% |
| linkedin.com/school/columbia-university | 150 | 1.6% |
| linkedin.com/company/3159 | 117 | 1.3% |
| linkedin.com/company/2624 | 82 | 0.9% |
| linkedin.com/school/cornell-university | 65 | 0.7% |
| linkedin.com/company/4262 | 51 | 0.5% |
| linkedin.com/school/fashion-institute-of-technology | 51 | 0.5% |
| linkedin.com/company/7201 | 48 | 0.5% |
| linkedin.com/school/yale-university | 45 | 0.5% |
| linkedin.com/school/harvard-university | 37 | 0.4% |
| Other values (3371) | 8529 | 91.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 32016 | 9.7% |
| n | 31306 | 9.5% |
| i | 28660 | 8.7% |
| c | 22915 | 7.0% |
| l | 20040 | 6.1% |
| e | 19121 | 5.8% |
| / | 18711 | 5.7% |
| m | 15524 | 4.7% |
| s | 12743 | 3.9% |
| d | 11095 | 3.4% |
| Other values (39) | 117493 | 35.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 329624 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 329624 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 329624 | 100.0% |
UNIVERSITY_LOCATION
Text
Missing
| Distinct | 847 |
|---|---|
| Distinct (%) | 15.9% |
| Missing | 6076 |
| Missing (%) | 53.3% |
| Memory size | 89.2 KiB |
Length
| Max length | 93 |
|---|---|
| Median length | 46 |
| Mean length | 14.75173709 |
| Min length | 3 |
Unique
| Unique | 342 |
|---|---|
| Unique (%) | 6.4% |
Sample
| 1st row | Greenville, NC |
|---|---|
| 2nd row | Memphis, Tennessee |
| 3rd row | Omaha, Nebraska |
| 4th row | Laoag, Ilocos Norte |
| 5th row | New York, NY |
| Value | Count | Frequency (%) |
| ny | 1326 | 10.4% |
| new | 1090 | 8.6% |
| york | 897 | 7.0% |
| nj | 325 | 2.6% |
| pa | 246 | 1.9% |
| ca | 212 | 1.7% |
| ma | 210 | 1.6% |
| chicago | 145 | 1.1% |
| massachusetts | 139 | 1.1% |
| cambridge | 134 | 1.1% |
| Other values (983) | 8014 | 62.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 7414 | 9.4% |
| e | 5353 | 6.8% |
| a | 5214 | 6.6% |
| , | 5182 | 6.6% |
| n | 5002 | 6.4% |
| o | 5002 | 6.4% |
| r | 3986 | 5.1% |
| i | 3599 | 4.6% |
| t | 2955 | 3.8% |
| N | 2765 | 3.5% |
| Other values (85) | 32081 | 40.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 78553 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 78553 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 78553 | 100.0% |
EDUCATION_DESCRIPTION
Text
Missing
| Distinct | 3827 |
|---|---|
| Distinct (%) | 96.9% |
| Missing | 7453 |
| Missing (%) | 65.4% |
| Memory size | 89.2 KiB |
Length
| Max length | 1590 |
|---|---|
| Median length | 726 |
| Mean length | 168.9898683 |
| Min length | 3 |
Unique
| Unique | 3763 |
|---|---|
| Unique (%) | 95.3% |
Sample
| 1st row | Activities and Societies: The Golden Key International Honour Society |
|---|---|
| 2nd row | Activities and Societies: President, Graduate Student Association (2020-2021) Graduate Student Senator, UNMC student senate (2020-2021) Vice President, Graduate Student Association (2019-2020) Treasurer, International Student Association (2018-2019) |
| 3rd row | Ph.D. research explores earthquake hazard maps, how to assess their performance, and measuring uncertainties in their calculations. We hope to address the questions of why maps sometimes fail with disastrous consequences (such as Tohoku 2011, Haiti 2011, or Nepal 2015), and suggest ways to improve their generation and performance. |
| 4th row | Activities and Societies: A member of the Presidential Scholars |
| 5th row | Coursework covers univariate and multivariate analysis. Additional focus has been placed on machine learning, time series analysis, computing and experimental design. |
| Value | Count | Frequency (%) |
| and | 4689 | 5.1% |
| of | 2602 | 2.8% |
| the | 2386 | 2.6% |
| (empty token) | 2383 | 2.6% |
| in | 2111 | 2.3% |
| activities | 1389 | 1.5% |
| societies | 1363 | 1.5% |
| for | 978 | 1.1% |
| to | 975 | 1.1% |
| a | 951 | 1.0% |
| Other values (12545) | 72512 | 78.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 88250 | 13.2% |
| e | 56286 | 8.4% |
| i | 47919 | 7.2% |
| a | 42809 | 6.4% |
| t | 41487 | 6.2% |
| n | 41260 | 6.2% |
| o | 37837 | 5.7% |
| r | 33015 | 4.9% |
| s | 31756 | 4.8% |
| c | 21919 | 3.3% |
| Other values (114) | 224634 | 33.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 667172 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 667172 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 667172 | 100.0% |